Creating Interactive Graphs and Maps with

11.S954 Applied Data Science for Citie

Published

November 14, 2023

Overview

Today we’re going to explore a couple of new packages and how they can be used to enhance our visualizations. We will use an example to show how to use these tools effectively:

  1. plotly and adding interactivity through ggplotly
  2. Interactive maps with leaflet
  3. Introduce dashboards with flexdashboard

In terms of creating good and effective data analyses, mastering the ‘fundamentals’ is more important than pursuing ‘advanced’ tools. Step one is always to have a good understanding of the data you are representing. Interactivity tools contribute to improving interpretation, but they cannot replace proficiency in the subject matter.

# You may need to install plotly, leaflet, flexdashboard

library(tidyverse)
library(plotly)
library(leaflet)
library(sf)
library(tigris)

Airbnb data in Chicago

Inside Airbnb is a non-commercial set of data that allows you to explore how Airbnb is being used in cities around the world. Some visualizations of the Chicago Airbnb data can be found here, where you can see maps showing the type of room, activity, availability, and listings per host for Airbnb listings.

On this data downloading page, please locate the City of Chicago, right-click on the link of “listings.csv.gz”, select “Copy link address”, then paste the link to the “url” argument below. The download.file function fetches files directly from the website.

download.file(url = "http://data.insideairbnb.com/united-states/il/chicago/2023-09-12/data/listings.csv.gz", 
              destfile = "data/listings.csv.gz")

After you have the “listings.csv.gz” file in your data folder, run the following code, which reads the .gz file (gzip compressed file) and creates the dataset we’ll use today.

con <- gzfile("data/listings.csv.gz", 'rt') 
data <- con |> 
  read.csv(header = TRUE) |> 
  select(id, name, host_name, 
         neighborhood = neighbourhood_cleansed, 
         latitude, longitude, room_type, price, minimum_nights, number_of_reviews, last_review, reviews_per_month, availability_365, review_scores_rating)
close(con)

Data Cleaning

In class, we covered two important data-cleaning steps. We first removed the dollar signs from the price column to facilitate our subsequent numerical analyses. We then converted the character data in the last_review column into the date format to manage temporal-related analyses. We used the stringr package and the lubridate package, respectively.

airbnb <- data |> 
  mutate(price = str_replace(string = price, 
                             pattern = "\\$", 
                             replacement = "")) |> 
  mutate(price =  str_replace(price, ",", "")) |> 
  mutate(price = as.numeric(price))
airbnb <- airbnb |> 
  mutate(last_review = ymd(last_review)) |> 
  mutate(last_review_year = year(last_review),
         last_review_month = month(last_review))

After these two data-cleaning steps, please save a copy of this dataset to your data folder. It will make it easier for us to import it into another script shortly.

saveRDS(airbnb, "data/airbnb.rds")

Interactivity with plotly

Now we are going to recreate the ggplotly charts we’ve seen in class below. plotly is much larger than just the R universe, with implementations in Python, Julia, Javascript, and others. However, its interface with ggplot is practically easy to use and really maximizes the use of ggplot graphs.

Chart B: Median Room Price by Type

g <- airbnb |> 
  group_by(room_type) |> 
  summarise(median_price = median(price)) |> 
  ggplot() + 
    geom_col(aes(x = room_type, y = median_price), fill = "#62A0CA", width = 0.5) + 
  theme_bw()+
  labs(x = "Room Type", y = "Median Price")
  

ggplotly(g)

Chart C: Donut Chart of Last Reviews

airbnb |> 
  group_by(last_review_year) |> 
  summarise(count_review = n()) |> 
  plot_ly() |>  
  add_pie(labels = ~last_review_year, 
          values = ~count_review,
          hole = 0.6)

Introduction to leaflet interactive maps

Like many other things in the R universe, leaflet has great documentation and very nice vignette(s) to get you started. I’d recommend going through an introductory tutorial like this one, from basemaps, markers, popups to lines and shapes.

The general steps for creating a Leaflet map are very similar to the “layered” grammar of ggplot:

  1. Initiate leaflet()
    • Can add data to the individual layers as well.
  2. Add the Basemap with addProviderTiles
  3. Add geometric object(s)
    • addCircleMarkers() for points
    • addPolygons() for shapes
    • addPolylines() for lines
  4. Add a color pallete and a legend
    • colorFactor() for representing categorical variables
    • colorQuantile(), colorBin(), colorNumeric for representing numeric variables
  5. Add additional elements and layer controls
    • addLayersControl()

Work Process

Basemap

First, we initiate leaflet and add a basemap. We are adding a basemap created by CartoDB. Alternatively, there are all the other web map options from ESRI, OpenStreetMap, etc.

The setView defines the initial center and zoom level of a map, where larger zoom values result in a more detailed view.

leaflet() |>
  addProviderTiles(providers$CartoDB.Positron) |>
  setView(lng = -87.636483, lat = 41.862984,  zoom = 10) 

Circle markers

Then we can add the Airbnb listings as circle markers on the map, where we can define:

  • the circle size (radius),
  • whether to have fill (fill=TRUE), with what color (fillColor) and transparency (fillOpacity),
  • whether to have strokes (stroke=TRUE), stroke width (weight), and with what color (color) and transparency (opacity),

etc., etc….

leaflet() |>
  addProviderTiles(providers$CartoDB.Positron) |>
  setView(lng = -87.636483, lat = 41.862984,  zoom = 10) |> 
  addCircleMarkers(data = airbnb,
                   fill = TRUE,
                   fillOpacity = 0.5,
                   stroke = FALSE,
                   radius = 1) 

Color

Leaflet uses color mapping functions to assign colors to categorical variables and numeric variables, respectively. In this following line, we create a “relationship” between the categorical values in the airbnb$room_type variable and a sequential color palette RdYlGn (“red-yellow-green”).

pal <- colorFactor(palette = "RdYlGn", domain = airbnb$room_type)

Then this “relationship” pal is passed to two things:

1) fillColor = ~pal(room_type). You again encounter the use of the tilde (~) symbol, indicating that this input represents a relationship on a given variable (room type).

2) addLegend(pal = pal) so the that colors in the legend will show up accordingly.

pal <- colorFactor(palette = "RdYlGn", domain = airbnb$room_type)

leaflet() |>
  addProviderTiles(providers$CartoDB.Positron) |>
  setView(lng = -87.636483, lat = 41.862984,  zoom = 10) |> 
  addCircleMarkers(data = airbnb,
                   fillColor = ~pal(room_type),
                   fillOpacity = 1,
                   stroke = FALSE,
                   radius = 1,
                   popup = listing_popup) |> 
  addLegend(
    position = 'topright',
    pal = pal,
    values = airbnb$room_type,
    title = "Room Type"
  )

Polygons

Let’s grab a tigris place boundary for Chicago, and add it to our map:

options(tigris_use_cache=TRUE)
chi_boundary <- places(state = "IL") |> filter(NAME == "Chicago")

In this additional function addPolygons, we can adjust the boundary color and fill color:

leaflet() |>
  addProviderTiles(providers$CartoDB.Positron) |>
  setView(lng = -87.636483, lat = 41.862984,  zoom = 10) |> 
  addCircleMarkers(data = airbnb,
                   fillColor = ~pal(room_type),
                   fillOpacity = 1,
                   stroke = FALSE,
                   radius = 1,
                   popup = listing_popup) |>
  addPolygons(data = chi_boundary,
              color = "blue",
              fill = FALSE,
              weight = 1) |> 
  addLegend(
    position = 'topright',
    pal = pal,
    values = airbnb$room_type,
    title = "Room Type"
  )

What we have created so far already resembles Inside Airbnb’s visualization. For the last step, we will add a layer control to turn on and off layer groups as we wish.

Layer Control

baseGroups: We added two more basemaps using addProviderTiles, organizing each as a group named “ESRI World Imagery,” “CartoDB Dark,” and “CartoDB Positron” correspondingly. These group names are passed to the addLayersControl for users to toggle on and off.

overlayGroups: This argument organizes layers on top of the base map. In this case, we added a group argument in both the addCircleMarkers and addPolygons functions, giving each a name, then passed their names to addLayersControl. Similar to the case of basemaps, users can toggle on and off the points and the polygon layers with their respective names.

options: In this example, collapsed = TRUE means that the layers control will initially be collapsed or hidden, showing a cleaner interface.

leaflet() |>
  addProviderTiles("Esri.WorldImagery", group = "ESRI World Imagery") |> 
  addProviderTiles(providers$CartoDB.DarkMatter, group = "CartoDB Dark") |> 
  addProviderTiles(providers$CartoDB.Positron) |>
  setView(lng = -87.636483, lat = 41.862984,  zoom = 10) |> 
  addCircleMarkers(data = airbnb,
                   fillColor = ~pal(room_type),
                   fillOpacity = 1,
                   stroke = F,
                   radius = 1,
                   popup = listing_popup,
                   group = "Airbnb Listings") |>
  addPolygons(data = chi_boundary,
              color = "blue",
              fill = FALSE,
              weight = 1,
              group = "Chicago Boundary") |> 
  addLegend(
    position = 'topright',
    pal = pal,
    values = airbnb$room_type,
    title = "Room Type"
  ) |> 
  addLayersControl(
    baseGroups = c("ESRI World Imagery", "CartoDB Dark", "CartoDB Positron"),
    overlayGroups = c("Chicago Boundary", "Airbnb Listings"),
    options = layersControlOptions(collapsed = TRUE)
  ) 

Chart A: Leaflet Map

This is the complete code for the leaflet map we have created:

listing_popup <-
  paste0(airbnb$name, "<br>",
    "<b>Neighborhood: </b>", airbnb$neighborhood,"<br>",
    "<b>Room Type: </b>", airbnb$room_type, "<br>",
    "<b>Price: </b>", airbnb$price, "<br>",
    "<b>Average Rating: </b>", airbnb$review_scores_rating
  )

pal <- colorFactor(palette = "RdYlGn", domain = airbnb$room_type)

options(tigris_use_cache=TRUE)
chi_boundary <- places(state = "IL") |> filter(NAME == "Chicago")


leaflet() |>
  addProviderTiles("Esri.WorldImagery", group = "ESRI World Imagery") |>
  addProviderTiles(providers$CartoDB.DarkMatter, group = "CartoDB Dark") |>
  addProviderTiles(providers$CartoDB.Positron) |>
  setView(lng = -87.636483, lat = 41.862984,  zoom = 10) |> 
  addCircleMarkers(data = airbnb,
                   fillColor = ~pal(room_type),
                   fillOpacity = 1,
                   stroke = F,
                   radius = 1,
                   popup = listing_popup,
                   group = "Airbnb Listings") |>
  addPolygons(data = chi_boundary,
              color = "blue",
              fill = FALSE,
              weight = 1,
              group = "Chicago Boundary") |> 
  addLegend(
    position = 'topright',
    pal = pal,
    values = airbnb$room_type,
    title = "Room Type"
  ) |> 
  addLayersControl(
    baseGroups = c("ESRI World Imagery", "CartoDB Dark", "CartoDB Positron"),
    overlayGroups = c("Chicago Boundary", "Airbnb Listings"),
    options = layersControlOptions(collapsed = TRUE)
  ) 

Introduction to flexdashboard

Crafting a fundamental flexdashboard does not require additional knowledge such as Javascript, and the process of integrating it into a website or a shiny app (which we’ll be learning soon!) is also manageable. Today, our focus is on populating a flexdashboard template to transform a few of our interactive visualizations into a unified dashboard experience.

From your RStudio interface, go to File - New File - R Markdown

In the popped-up window, select “From Template” - “Flex Dashboard”

You have a new opened file that serves as the template of the dashboard. Save this file as “flexdashboard.Rmd”

Go ahead click “Knit” - “Knit to flex_dashboard”. You now should see a blank dashboard. Any code you put into the “Chart A”, “Chart B” and “Chart C” sections will be displayed in their respective spaces.

Create your own dashboard

Just below the YAML header, there is a code chuck named setup and marked include = FALSE. Here you can supply any data related to package loading and data preparation. Make sure you include here any of your scripts related to loading packages and reading data:

Now you only need to identify and isolate the code that we produced today to populate the respective three chart sections. In other words, you should copy all the code we’ve worked through under the heading “Chart A: Leaflet Map”, and paste them in the blank code chunk under “Chart A”. Then copy and paste all the codes under “Chart B: Median Room Price by Type” and “Chart C: Donut Chart of Last Reviews” to the flexdashboard sections Chart B and Chart C.

When you are ready, knit the document again, and a dashboard should appear!

Exercise

Please choose another U.S. city, and create a flexdashboard for its Airbnb listings.

Map: Your dashboard should include a leaflet map with basemap, polygon, and points layers. (If you follow the same data-obtaining and cleaning process, your dataset should be similar to our example, and you should be able to reuse most of the code with minor changes).

Charts: You are free to create any charts based on your preference. Here is the list of the descriptive questions we have generated in class!

Of course, there are a number of things we can do to make our dashboard look even better. Feel free to refer to the following resources to figure out things like how to choose another layout, how to supply chart titles and adjust chart width, how to add a tabset, etc.

Work Product

Let’s publish your work to Rpubs! On the top-right corner of your R working panel, you will find a blue Publish icon called “publish the application or document”. Click that and you will be directed to this screen below:

Click RPubs - then in the next screen, you will need to either sign in (if you have used Rpubs before), or create a new account.

At the last step, you can title your project, and choose an url string you would like to use to identify your project.

Click Continue. Then after just a few seconds, you can visit your dashboard using your URL in a web browser!

Please submit your flexdashboard.Rmd file and the link to your RPubs dashboard to Canvas. You can paste your link into the submission comments area. Please upload your report to Canvas by the end of day, Tuesday, Nov 21.